Continuous Checkpointing: Joining the Checkpointing with Virtual Memory Paging
نویسندگان
چکیده
Checkpointing is a basic mechanism for backward error-recovery in fault-tolerant systems. A checkpointed process stops execution and saves its states to files periodically. To reduce the file sizes, only data modified between two consecutive checkpoint times is saved. However, existing approaches do not consider operating system paging activities; which, if ignored may double the number of disk accesses required to checkpoint non-residentdirty pages. In this paper,we propose continuous checkpointing,which combines the checkpoint facility with virtual memory paging operations. Thus, checkpointing is continuous during the lifetime of a process without extra overhead. Checkpoint size is no longer proportional to application size, but rather is bounded by resident dirty pages. Experimental results show that disk accesses can be reduced by about 80% when checkpointing large applications. 1997 by JohnWiley & Sons, Ltd.
منابع مشابه
An analysis of location record checkpointing interval for mobility database in PCS networks
Mobility database that stores the users’ location records is very important to connect calls to mobile users on personal communication networks. If the mobility database fails, calls to mobile users may not be set up in time. This paper studies failure restoration of mobility database. We study per-user location record checkpointing schemes that checkpoint a user’s record into a non-volatile st...
متن کاملStability Assessment Metamorphic Approach (SAMA) for Effective Scheduling based on Fault Tolerance in Computational Grid
Grid Computing allows coordinated and controlled resource sharing and problem solving in multi-institutional, dynamic virtual organizations. Moreover, fault tolerance and task scheduling is an important issue for large scale computational grid because of its unreliable nature of grid resources. Commonly exploited techniques to realize fault tolerance is periodic Checkpointing that periodically ...
متن کاملAn Enhanced MSS-based checkpointing Scheme for Mobile Computing Environment
Mobile computing systems are made up of different components among which Mobile Support Stations (MSSs) play a key role. This paper proposes an efficient MSS-based non-blocking coordinated checkpointing scheme for mobile computing environment. In the scheme suggested nearly all aspects of checkpointing and their related overheads are forwarded to the MSSs and as a result the workload of Mobile ...
متن کاملVM-μCheckpoint: Design, Modeling, and Assessment of Lightweight In-Memory VM Checkpointing
Checkpointing and rollback techniques enhance reliability and availability of virtual machines and their hosted IT services. This paper proposes VM-μCheckpoint, a light-weight pure-software mechanism for high-frequency checkpointing and rapid recovery for VMs. Compared with existing techniques of VM checkpointing, VM-μCheckpoint tries to minimize checkpoint overhead and speed up recovery by mea...
متن کاملA Two-Level Checkpoint Algorithm in a Highly-Available Parallel Single Level Store System
A Parallel Single Level Store systems (PSLS) integrates a shared virtual memory and a parallel file system. Managing globally the data, they provide programmers of scientific applications with the attractive shared memory programming model combined with a large and efficient file system in a cluster. In this paper, we present a cheap and efficient two-level checkpointing approach enabling a PSL...
متن کاملذخیره در منابع من
با ذخیره ی این منبع در منابع من، دسترسی به آن را برای استفاده های بعدی آسان تر کنید
برای دانلود متن کامل این مقاله و بیش از 32 میلیون مقاله دیگر ابتدا ثبت نام کنید
ثبت ناماگر عضو سایت هستید لطفا وارد حساب کاربری خود شوید
ورودعنوان ژورنال:
- Softw., Pract. Exper.
دوره 27 شماره
صفحات -
تاریخ انتشار 1997